Add action-agent runtime and generated config pipeline by skywhite1024 · Pull Request #306 · DexForce/EmbodiChain

skywhite1024 · 2026-06-15T09:05:23Z

Description

Add the action-agent runtime and generated-config pipeline for image-to-tabletop manipulation demos.

This PR introduces the reviewable vertical slice for running generated simulation tasks end to end:

TaskAgent and deterministic CompileAgent wrappers for nominal task-graph generation, validation, and execution.
JSON graph loading, schema validation, and deterministic start-to-goal graph compilation.
Atomic-action spec normalization and execution bridge to embodichain.lab.sim.atomic_actions.
Tableware env adapter for agent state, IK/FK, configured success checks, and action-list creation.
run_agent.py CLI for executing an existing generated gym_config + agent_config pair.
UR5 basket / relative-placement config generation from exported gym projects.
Image2Tabletop and prompt-to-geometry service integration helpers used by the demo pipeline.
One-command orchestration support for Demo1/Demo2/Demo3 generated configs.
Unit tests for LLM usage tracking, graph schema validation, backend atomic-action runtime integration, and generated config behavior.

Reviewer Cleanup

Addressed the review feedback around stale and invalid action-agent code:

Removed unsupported orientation from target-pose schema and added regression coverage that rejects it.
Removed stale _build_action_cfg_and_start() helper by inlining cfg/start-qpos construction.
Kept only the runtime atomic-action implementation; tableware adapter code now imports the current env registration path.
Rejected legacy/non-atomic action schemas in both runtime normalization and graph compilation.
Removed monitor/recovery compatibility leftovers from the nominal runtime path.
Removed unused compile LLM wiring while preserving existing generated config compatibility.

Fixes # N/A

Type of change

New feature (non-breaking change which adds functionality)
Cleanup/refactor in response to review feedback

Screenshots

N/A. This PR adds runtime, config generation, and CLI infrastructure without UI changes.

Validation

conda run -n embodichain040 black \
  embodichain/gen_sim/action_agent_pipeline/agents/compile_agent.py \
  embodichain/gen_sim/action_agent_pipeline/agents/llm.py \
  embodichain/gen_sim/action_agent_pipeline/cli/run_agent.py \
  embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py \
  embodichain/gen_sim/action_agent_pipeline/generation/prompt_builders.py \
  embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py \
  embodichain/gen_sim/action_agent_pipeline/runtime/graph_compiler.py \
  tests/gen_sim/action_agent_pipeline/test_backend_atomic_runtime.py \
  tests/gen_sim/action_agent_pipeline/test_demo3_semantic_grasp_integration.py \
  tests/gen_sim/action_agent_pipeline/test_graph_spec_backend_atomic.py

Result: all listed files were already formatted.

conda run -n embodichain040 python -m pytest \
  tests/gen_sim/action_agent_pipeline/test_backend_atomic_runtime.py \
  tests/gen_sim/action_agent_pipeline/test_graph_spec_backend_atomic.py \
  tests/gen_sim/action_agent_pipeline/test_llm_usage.py \
  -q

Result: 20 passed.

Additional smoke checks:

Existing generated config compatibility for CompileAgent(prompt_name="compile_agent_graph", ...).
orientation is rejected as an unsupported target-pose field.

Checklist

I have run formatting on changed files.
I have added tests that prove my fix is effective or that my feature works.
Dependencies have been updated, if applicable. No dependency update is required.
Description updated to match the current PR scope and reviewer cleanup.

Copilot

Pull request overview

Adds the first executable “action-agent” runtime slice for running already-generated agent/gym configs end-to-end: loading/validating nominal task graphs, normalizing atomic-action specs, executing them in the sim (with a tableware env adapter), plus a CLI and unit tests.

Changes:

Introduce a nominal task-graph runtime (AgentTaskGraph) and JSON bundle compiler/validator.
Add an atomic-action spec normalization + execution bridge to embodichain.lab.sim.atomic_actions.
Provide a tableware agent env adapter and a run_agent.py CLI, with unit tests for usage tracking, schema validation, and atomic runtime integration.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `tests/gen_sim/action_agent_pipeline/test_llm_usage.py` | Unit tests for LLM usage normalization/recording and wrapper behavior. |
| `tests/gen_sim/action_agent_pipeline/test_graph_spec_backend_atomic.py` | Unit tests for graph compilation accepting/rejecting action schemas. |
| `tests/gen_sim/action_agent_pipeline/test_backend_atomic_runtime.py` | Unit tests for atomic-action spec normalization and runtime execution bridge. |
| `embodichain/gen_sim/action_agent_pipeline/utils/mllm.py` | OpenAI/LangChain client factory with usage tracking wrapper. |
| `embodichain/gen_sim/action_agent_pipeline/utils/llm_usage.py` | JSONL usage logging + aggregation utilities and LangChain proxy wrapper. |
| `embodichain/gen_sim/action_agent_pipeline/utils/llm_json.py` | Helpers to extract/normalize JSON objects from LLM responses. |
| `embodichain/gen_sim/action_agent_pipeline/utils/llm_config.py` | Shared LLM config loading from env + gen_config + local .env files. |
| `embodichain/gen_sim/action_agent_pipeline/utils/__init__.py` | Package init for utils module. |
| `embodichain/gen_sim/action_agent_pipeline/runtime/task_graph.py` | Deterministic nominal graph runtime and executed-action list wrapper. |
| `embodichain/gen_sim/action_agent_pipeline/runtime/graph_compiler.py` | Loader/validator/compiler for nominal task graphs into runtime objects. |
| `embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py` | Atomic-action spec normalization and execution, incl. parallel arm execution. |
| `embodichain/gen_sim/action_agent_pipeline/runtime/atom_action_utils.py` | Runtime helpers for arm-side resolution and agent-state synchronization. |
| `embodichain/gen_sim/action_agent_pipeline/runtime/__init__.py` | Package init for runtime module. |
| `embodichain/gen_sim/action_agent_pipeline/prompts/task_prompt.py` | Task-agent prompt builder for nominal atomic-action graphs. |
| `embodichain/gen_sim/action_agent_pipeline/prompts/basic_background.txt` | Shared environment background prompt text. |
| `embodichain/gen_sim/action_agent_pipeline/prompts/atom_actions.txt` | Shared atomic-action JSON schema prompt text. |
| `embodichain/gen_sim/action_agent_pipeline/prompts/__init__.py` | Package init exporting `TaskPrompt`. |
| `embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/success.py` | Config-driven success predicate evaluation for tableware tasks. |
| `embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py` | Tableware agent env adapter wiring Task/Compile agents and state caching. |
| `embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/atomic_actions.py` | Gym env registration for the atomic-actions agent environment. |
| `embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/__init__.py` | Package init for tableware env adapters. |
| `embodichain/gen_sim/action_agent_pipeline/env_adapters/__init__.py` | Package init for env adapters. |
| `embodichain/gen_sim/action_agent_pipeline/cli/run_agent.py` | CLI entrypoint to execute action-agent tasks from existing configs. |
| `embodichain/gen_sim/action_agent_pipeline/cli/__init__.py` | Package init for CLI module. |
| `embodichain/gen_sim/action_agent_pipeline/agents/task_agent.py` | TaskAgent wrapper to generate and persist the nominal task graph. |
| `embodichain/gen_sim/action_agent_pipeline/agents/llm.py` | Shared LLM construction (safe initialization + usage-stage tagging). |
| `embodichain/gen_sim/action_agent_pipeline/agents/compile_agent.py` | CompileAgent wrapper to persist compiled graph bundles and execute them. |
| `embodichain/gen_sim/action_agent_pipeline/agents/agent_base.py` | Shared agent base for prompt file resolution/loading. |
| `embodichain/gen_sim/action_agent_pipeline/agents/__init__.py` | Package init for agents module. |
| `embodichain/gen_sim/action_agent_pipeline/__init__.py` | Package init for action-agent pipeline. |


+        if edge.get("left_arm_action") is None and edge.get("right_arm_action") is None:
+            raise ValueError(f"Nominal edge '{edge_id}' must define an arm action.")


+def _pose(env, uid: str) -> torch.Tensor:
+    return env.sim.get_rigid_object(uid).get_local_pose(to_matrix=True)


+def extract_drive_calls(code_str: str) -> List[str]:
+    """Extract all drive() function calls from a code string.
+
+    Args:
+        code_str: Python code string to parse.


+        runtime_kwargs = _runtime_kwargs(kwargs, getattr(self, "prompt_kwargs", {}))
+        graph = compile_agent_graph_from_file(graph_file_path)
+        result = graph.run(**runtime_kwargs)
+        print("Compiled agent graph executed successfully.")
+        return result


XuanchaoPENG · 2026-06-16T06:59:42Z

use_place_action is an invalid parameter.
Atomic actions are defined under duplicate names in action_agent_pipeline/env_adapters/tableware and action_agent_pipeline/runtime; keep only one implementation of the atomic actions.
There is invalid code such as sync_agent_state_from_robot, extract_drive_calls, etc.
The action edge contains a non-atomic action.
There are invalid functions such as _build_action_cfg_and_start() in action_agent_pipeline/runtime/atom_actions.py.
orientation appears to be unused in the prompt.

skywhite1024 · 2026-06-17T01:57:54Z

Thanks for the detailed review. I updated the PR branch and addressed the action-agent cleanup items:

Removed use_place_action and verified it no longer appears in action_agent_pipeline.
Kept only the runtime atomic-action implementation; the tableware adapter now imports the current env registration path, and the Demo3 semantic grasp integration test imports agent_env instead of the removed tableware.atomic_actions module.
Removed stale/invalid leftovers including _build_action_cfg_and_start, monitor return fields, monitor runtime args, and unused compile LLM wiring.
The graph compiler and runtime now reject legacy/non-atomic action schemas (action, fn/kwargs, unsupported edge fields).
Removed orientation from the supported target-pose schema and added regression coverage that rejects it as an unsupported field.

Validation in embodichain040:

conda run -n embodichain040 python -m pytest \
  tests/gen_sim/action_agent_pipeline/test_backend_atomic_runtime.py \
  tests/gen_sim/action_agent_pipeline/test_graph_spec_backend_atomic.py \
  tests/gen_sim/action_agent_pipeline/test_llm_usage.py \
  -q

Result: 20 passed.

I also updated the PR description because the PR branch has now been synced to the current local branch, so the scope is broader than the original runtime-only slice and includes generated config / demo pipeline support.

yuecideng · 2026-06-19T05:26:21Z

Code review

Branch: 17 commits, 57 files, +17,269 LOC. The PR adds the action-agent runtime + generated-config pipeline for image-to-tabletop manipulation. The cleanup items from the prior review (duplicate atomic-actions, use_place_action, stale helpers) were verified as resolved.

Good design patterns

Clean three-layer separation in runtime/ — graph_compiler.py (JSON→graph), task_graph.py (pure graph + linear traversal), atom_actions.py (execution) each own one concern; the compiler injects graph_cls/action_module so layers are independently testable.
Single schema boundary — AtomicActionSpec (frozen dataclass) + normalize_atomic_action_spec is the one validation entry; legacy schemas (fn, action, target) are rejected with actionable messages.
Compile-time path validation — _validate_nominal_path rejects duplicate node/edge ids, dangling refs, cycles, and unreachable goals before runtime, so AgentTaskGraph.run() can assume a clean linear path.
Compiled-graph cache with hash + schema version — SHA-256 of canonical sorted-key JSON + COMPILED_GRAPH_SCHEMA_VERSION; stale artifacts self-heal.
Content-addressed CoACD mesh cache — keys cache on source_sha256 + policy_version + dexsim_engine_version + transform, so policy bumps auto-invalidate.
Consistent HTTP error envelope across SAM3/SAM3D/LLM clients — explicit timeout_s, typed errors, JSON-shape validation.
Zip-slip protection in _safe_extract_zip.
Declarative compositional success evaluator — recursive all/any/not over a leaf-predicate catalog.
Non-invasive LLM usage tracking — thin proxy over LangChain, process-local env vars, scrubbed from subprocess envs.
Network-gated tests — pytest.mark.skipif(... RUN_DEXSIM_GRASP_TESTS ...) and pytest.importorskip(...) keep the suite green without services.
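
The declarative success evaluator mentioned above (recursive all/any/not over leaf predicates) can be pictured with a minimal sketch. The predicate names, the spec shape, and the `state` mapping below are hypothetical and are not taken from `success.py`; they only illustrate the recursion.

```python
from __future__ import annotations

from typing import Any, Callable, Mapping

# Hypothetical leaf-predicate catalog; the real catalog lives in success.py.
LEAF_PREDICATES: dict[str, Callable[..., bool]] = {
    "height_above": lambda state, uid, z: state[uid][2] > z,
    "height_below": lambda state, uid, z: state[uid][2] < z,
}


def evaluate(spec: Mapping[str, Any], state: Mapping[str, tuple]) -> bool:
    """Recursively evaluate an all/any/not tree over leaf predicates."""
    if "all" in spec:
        return all(evaluate(child, state) for child in spec["all"])
    if "any" in spec:
        return any(evaluate(child, state) for child in spec["any"])
    if "not" in spec:
        return not evaluate(spec["not"], state)
    kind = spec.get("type")
    if kind not in LEAF_PREDICATES:
        raise ValueError(f"Unknown success predicate: {kind!r}")
    # Every key other than "type" is passed to the leaf as a parameter.
    params = {key: value for key, value in spec.items() if key != "type"}
    return LEAF_PREDICATES[kind](state, **params)
```

Because the tree is plain data, a generated config can describe success without shipping any Python of its own.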

Points to improve

Correctness / robustness

Schema/sentinel mismatch in graph_compiler.py — _validate_task_spec only treats JSON null as empty, but _compile_action also coerces ""/"none"/"null". An edge like {"left_arm_action": "", "right_arm_action": "none"} passes validation, compiles to both-None, and fails later in execute_parallel_atomic_actions with a worse error.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/graph_compiler.py

Lines 156 to 160 in 24901bd

    edge_ids.add(edge_id)
    if edge.get("left_arm_action") is None and edge.get("right_arm_action") is None:
        raise ValueError(f"Nominal edge '{edge_id}' must define an arm action.")
    for node_key in ("source", "target"):
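
One way to close the gap is to share a single emptiness rule between validation and compilation. The sketch below is an assumption about how that could look: `is_empty_action` and `validate_edge_has_action` are hypothetical names, and the case-insensitive, whitespace-stripping comparison is a guess at how `_compile_action` coerces its sentinels.

```python
from __future__ import annotations

from typing import Any, Mapping

# Sentinel strings the review says _compile_action coerces to "no action".
_EMPTY_SENTINELS = frozenset({"", "none", "null"})


def is_empty_action(value: Any) -> bool:
    """Return True for None and for string sentinels treated as empty."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip().lower() in _EMPTY_SENTINELS


def validate_edge_has_action(edge: Mapping[str, Any], edge_id: str) -> None:
    """Reject edges whose arm actions are both empty under the shared rule."""
    left_empty = is_empty_action(edge.get("left_arm_action"))
    right_empty = is_empty_action(edge.get("right_arm_action"))
    if left_empty and right_empty:
        raise ValueError(f"Nominal edge '{edge_id}' must define an arm action.")
```

With this rule the edge from the review, `{"left_arm_action": "", "right_arm_action": "none"}`, fails at validation time instead of inside the executor.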

success.py::_pose() crashes opaquely on bad uid — env.sim.get_rigid_object(uid) returns None for a typoed object id, then .get_local_pose(...) raises AttributeError. Every leaf predicate funnels through _pose. Add an explicit None check raising ValueError(f"Unknown rigid object uid: {uid!r}").

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/success.py

Lines 100 to 103 in 24901bd

    def _pose(env, uid: str) -> torch.Tensor:
        return env.sim.get_rigid_object(uid).get_local_pose(to_matrix=True)
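
A minimal sketch of the suggested guard follows. It assumes, as the review states, that `get_rigid_object` returns `None` for an unknown uid; `get_pose_checked` is a hypothetical name used here to avoid implying this is the project's code.

```python
def get_pose_checked(env, uid: str):
    """Look up a rigid object's pose, failing loudly on an unknown uid."""
    obj = env.sim.get_rigid_object(uid)
    if obj is None:
        # Surface the bad id directly instead of an AttributeError on None.
        raise ValueError(f"Unknown rigid object uid: {uid!r}")
    return obj.get_local_pose(to_matrix=True)
```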

Non-atomic cache writes — coacd_cache_bridge.py (pickle.dump to final path) and coacd_cache.py (CoACD .obj write). An interrupt leaves a corrupt file that future readers silently consume. Use a sibling temp + os.replace().

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/coacd_cache_bridge.py

Lines 174 to 183 in 24901bd

    with cache_path.open("wb") as cache_file:
        pickle.dump(
            {
                "plane_equations": plane_equations,
                "plane_equation_counts": plane_equation_counts,
            },
            cache_file,
        )
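
The sibling-temp-plus-`os.replace()` pattern could look roughly like this. `atomic_pickle_dump` is a hypothetical helper name, and the `fsync` call is an optional durability choice rather than something the review asked for.

```python
import contextlib
import os
import pickle
import tempfile
from pathlib import Path
from typing import Any


def atomic_pickle_dump(payload: Any, cache_path: Path) -> None:
    """Write a pickle so readers see either the old file or the complete new one."""
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file sits in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=cache_path.parent, prefix=f".{cache_path.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            pickle.dump(payload, tmp_file)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        os.replace(tmp_name, cache_path)
    except BaseException:
        # Remove the partial temp file; the previous cache file is untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

An interrupted write then leaves at most a stray temp file, never a truncated cache entry under the final name.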

Unbounded retries / polls — prompt2geometry/pipeline.py (while True with bare except Exception on JSON extraction) and image2tabletop_client.py::wait_for_job (polls forever with no deadline). Add max-attempts + backoff + timeout.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/gym_project_api/prompt2geometry/pipeline.py

Lines 334 to 342 in 24901bd

    ]
    while True:
        try:
            raw = client.chat_json(messages=messages)
            return _validate_glb_stem_output(raw)
        except Exception:
            time.sleep(1.0)
            continue
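
A bounded replacement for the loop above might be sketched as follows. `call_with_retries` and its parameters are hypothetical, and the choice of `ValueError` as the retryable error is an assumption; the real code would need to pick the exceptions its client and validator actually raise. A wall-clock deadline could be layered on top in the same way.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    *,
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (ValueError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, with capped exponential backoff between tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(min(max_delay_s, base_delay_s * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")
```

Exceptions outside `retry_on` propagate immediately, so programming errors are not retried.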

Silent exception swallowing — atom_actions.py:957 (except Exception: return on grasp-collision cache prep — bugs get masked, collision checking silently degrades); agents/llm.py::_create_llm_safe returns None with no log — root cause lost. Narrow clauses and log.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 956 to 958 in 24901bd

        )
    except Exception:
        return
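
Narrowing the clause and logging could look like the sketch below. `prepare_collision_cache` is a hypothetical wrapper, and treating `OSError` as the only expected failure is an assumption made for illustration.

```python
import logging
from typing import Callable, Tuple, Type

logger = logging.getLogger(__name__)


def prepare_collision_cache(
    prepare: Callable[[], None],
    *,
    expected: Tuple[Type[BaseException], ...] = (OSError,),
) -> bool:
    """Run cache prep; log and report expected failures, let real bugs propagate."""
    try:
        prepare()
    except expected:
        # exc_info=True keeps the traceback so the root cause is not lost.
        logger.warning(
            "Grasp-collision cache prep failed; continuing without it.",
            exc_info=True,
        )
        return False
    return True
```

The boolean result lets the caller decide whether degraded collision checking is acceptable for the current run.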

resolve_arm_side doesn't raise — calls log_error(error_type=ValueError) but log_error only logs; the invalid side is returned and downstream _select_arm_parts produces confusing shape errors.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_action_utils.py

Lines 40 to 48 in 24901bd

    if side not in _available_arm_sides(env):
        log_error(
            f"Requested {side}_arm for robot_name='{robot_name}', but available "
            f"control parts are {getattr(env.robot, 'control_parts', None)}.",
            error_type=ValueError,
        )
    return side
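
If `log_error` indeed only logs, an explicit `raise` makes the contract unambiguous. The sketch below uses a hypothetical `resolve_arm_side_strict` that takes the available sides directly instead of reading them from `env`.

```python
from typing import Collection


def resolve_arm_side_strict(side: str, available_sides: Collection[str]) -> str:
    """Return the side only if the robot actually exposes it; otherwise raise."""
    if side not in available_sides:
        raise ValueError(
            f"Requested {side}_arm, but available sides are {sorted(available_sides)}."
        )
    return side
```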

task_agent.py cache has no input hash — unlike CompileAgent, a stale agent_task_graph.json is loaded silently when prompt/task_name changes but the file path doesn't. Add a prompt-version or input-hash check.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/agents/task_agent.py

Lines 55 to 59 in 24901bd

    if not kwargs.get("regenerate", False) and file_path.exists():
        print(f"Task graph already exists at {file_path}.")
        return load_txt(file_path)
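
An input-hash check could be sketched as below. The sidecar `.sha256` file, the function names, and the choice of `task_name` plus prompt text as the fingerprint inputs are all assumptions for illustration, not how `CompileAgent` stores its hash.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def input_fingerprint(task_name: str, prompt: str) -> str:
    """SHA-256 over a canonical JSON encoding of the generation inputs."""
    canonical = json.dumps(
        {"task_name": task_name, "prompt": prompt},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _meta_path(file_path: Path) -> Path:
    return file_path.with_name(file_path.name + ".sha256")


def load_cached_graph(file_path: Path, fingerprint: str) -> str | None:
    """Return the cached graph text only if it was built from the same inputs."""
    meta = _meta_path(file_path)
    if not (file_path.exists() and meta.exists()):
        return None
    if meta.read_text(encoding="utf-8").strip() != fingerprint:
        return None  # Stale: prompt or task name changed since the last run.
    return file_path.read_text(encoding="utf-8")


def store_graph(file_path: Path, graph_text: str, fingerprint: str) -> None:
    """Persist the graph alongside the fingerprint of its inputs."""
    file_path.write_text(graph_text, encoding="utf-8")
    _meta_path(file_path).write_text(fingerprint, encoding="utf-8")
```

A `None` result from `load_cached_graph` would then trigger regeneration, the same way `regenerate=True` does today.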

env=None defaults on required params — execute_parallel_atomic_actions and AgentTaskGraph.run. Env is dereferenced immediately; None produces an opaque AttributeError. Make required.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 436 to 440 in 24901bd

        right_arm_action=None,
        env=None,
        return_result: bool = False,
        **runtime_kwargs,
    ):
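
Making `env` a required keyword-only parameter moves the failure to the call site. The function below is a hypothetical stand-in with a placeholder body, shown only to illustrate the signature change.

```python
def run_parallel_actions(
    left_arm_action=None,
    right_arm_action=None,
    *,
    env,  # Required and keyword-only: omitting it raises TypeError at the call.
    return_result: bool = False,
    **runtime_kwargs,
):
    """Placeholder body: collect the non-empty actions that would be executed."""
    actions = [a for a in (left_arm_action, right_arm_action) if a is not None]
    return actions if return_result else None
```

Callers that already pass `env=...` by keyword keep working unchanged.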

utils/mllm.py::apply_proxy_env mutates os.environ globally as a side effect of create_openai_client/create_chat_openai — leaks across tests and parallel workers. Pass http_client=/proxies to constructors instead.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/utils/mllm.py

Lines 39 to 46 in 24901bd

    def apply_proxy_env(proxy_url: str | None) -> None:
        """Apply an optional proxy URL for OpenAI-compatible clients."""
        if not proxy_url:
            return
        os.environ["HTTP_PROXY"] = proxy_url
        os.environ["HTTPS_PROXY"] = proxy_url
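
The idea of scoping the proxy to one client rather than the whole process can be shown with the standard library alone. This is an analogy, not the OpenAI or LangChain constructor call the review refers to; `build_proxied_opener` is a hypothetical helper.

```python
from __future__ import annotations

import urllib.request


def build_proxied_opener(proxy_url: str | None) -> urllib.request.OpenerDirector:
    """Build an opener whose proxy is local to it; os.environ is never touched."""
    if not proxy_url:
        return urllib.request.build_opener()
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

Two openers built with different proxy URLs can coexist in one process, which is what a global `os.environ` write cannot offer.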

Structure / contracts

ur5_basket_config.py is 3,665 lines holding ≥6 separable concerns (scene parsing, target replacement, LLM role refinement, relative-placement geometry, GLB/glTF parsing, static robot/sensor/light templates). At minimum split out a glb_io.py (note: _read_glb in this file duplicates mesh_frame_normalization.py::_read_glb) and move static config templates to YAML/JSON data files.

https://github.com/DexForce/EmbodiChain/blob/24901bddef40c7fe69cae4e79d4d235aa4d73462/embodichain/gen_sim/action_agent_pipeline/generation/ur5_basket_config.py

run_agent_pipeline.py is 1,329 lines mixing arg parsing, two subprocess stages, rigid-object alias resolution, history lookup, and manifest writing. Split: target-replacement resolution, image2scene stage, arg parser + _DEFAULT_* constants into separate modules.

https://github.com/DexForce/EmbodiChain/blob/24901bddef40c7fe69cae4e79d4d235aa4d73462/embodichain/gen_sim/action_agent_pipeline/cli/run_agent_pipeline.py

base_agent_env.py::_init_agents fragile to reserved keys — _agent_config_with_prompt_keys strips prompt_kwargs but not task_name/config_dir; an Agent config block with either key raises TypeError: got multiple values for keyword argument.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py

Lines 47 to 61 in 24901bd

    )
    self.task_agent = TaskAgent(
        task_llm,
        **task_agent_config,
        **agent_config["TaskAgent"],
        task_name=task_name,
        config_dir=agent_config_path,
    )
    self.compile_agent = CompileAgent(
        **compile_agent_config,
        **agent_config["CompileAgent"],
        task_name=task_name,
        config_dir=agent_config_path,
    )
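
One option is to check the user-supplied block for reserved keys before unpacking it. The helper name and the decision to reject (rather than silently strip) are assumptions in this sketch.

```python
from __future__ import annotations

from typing import Any, Mapping

# Keys the env adapter passes itself, per the review.
RESERVED_AGENT_KEYS = frozenset({"task_name", "config_dir"})


def checked_agent_kwargs(
    block: Mapping[str, Any],
    agent_name: str,
    reserved: frozenset[str] = RESERVED_AGENT_KEYS,
) -> dict[str, Any]:
    """Return a copy of the config block, rejecting keys the adapter owns."""
    clashes = sorted(reserved.intersection(block))
    if clashes:
        raise ValueError(
            f"{agent_name} config must not set reserved keys: {clashes}"
        )
    return dict(block)
```

The caller would then unpack `**checked_agent_kwargs(agent_config["TaskAgent"], "TaskAgent")` and get a config-level error message instead of a `TypeError` from the constructor.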

agent_env.py::__init__ double-consumes **kwargs — forwards to both EmbodiedEnv.__init__ and _init_agents. The contract is fragile and undocumented.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/agent_env.py

Lines 36 to 42 in 24901bd

    def __init__(self, cfg: EmbodiedEnvCfg = None, **kwargs):
        super().__init__(cfg, **kwargs)
        if bool(getattr(self, "ignore_terminations_during_agent", False)):
            self.cfg.ignore_terminations = True
        super()._init_agents(**kwargs)

execute_parallel_atomic_actions couples build + step + UI — root of the previously-flagged "executes immediately" concern. Split into build_parallel_action_stream(...) (pure) and step_env_with_actions(env, actions).

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 487 to 491 in 24901bd

    for action in tqdm(actions):
        env.step(action)
        env.update_obj_info()
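
The proposed split can be sketched as a pure builder plus a thin stepping loop. The two function names come from the review; their bodies here are assumptions, and in particular the step-by-step pairing with `None` padding is only a placeholder for however the real code merges the two arms.

```python
from __future__ import annotations

import itertools
from typing import Any, Iterable, Sequence


def build_parallel_action_stream(
    left_actions: Sequence[Any], right_actions: Sequence[Any]
) -> list[tuple[Any, Any]]:
    """Pure step: pair per-arm actions, padding the shorter arm with None."""
    return list(itertools.zip_longest(left_actions, right_actions))


def step_env_with_actions(env, actions: Iterable[Any]) -> int:
    """Side-effecting step: apply each action and refresh object info."""
    count = 0
    for action in actions:
        env.step(action)
        env.update_obj_info()
        count += 1
    return count
```

Progress reporting such as `tqdm` can wrap the iterable at the call site, which keeps both functions free of UI concerns.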

Redundant double-normalization — _compile_action normalizes via normalize_atomic_action_spec and stores a dict; execute_atomic_action re-normalizes on every execution via AtomicActionSpec.from_mapping. Cache the AtomicActionSpec at compile time.
Hardcoded intranet IPs as defaults — 192.168.3.23:{5013,5015,5016,4523} in pipeline.py, image2tabletop_client.py, zimage_client.py, run_agent_pipeline.py. Make required or env-only.
utils/__init__.py is empty and missing __all__ — AGENTS.md requires __all__ in every public module. All other __init__.py files are populated correctly; every source file has the Apache 2.0 header and from __future__ import annotations.
Inconsistent timestamps — pipeline records use local time / second precision; usage logs use UTC / millisecond precision. Pick one convention.
test_backend_atomic_runtime.py::_FakeBackendAction.capture is a class-level mutable shared across instances. Reset via fixture or make it instance-scoped.
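
For the last point, the difference between a class-level and an instance-scoped capture list can be shown in a few lines. Both classes are hypothetical stand-ins, not the test double from the PR.

```python
class SharedCaptureAction:
    """Problematic shape: one list shared by every instance of the class."""

    capture: list = []

    def execute(self, payload) -> None:
        self.capture.append(payload)


class InstanceCaptureAction:
    """Safer shape: each instance starts with its own empty list."""

    def __init__(self) -> None:
        self.capture: list = []

    def execute(self, payload) -> None:
        self.capture.append(payload)
```

With the shared version, payloads recorded in one test remain visible in the next unless a fixture clears the list.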

🤖 Generated with Claude Code


yuecideng · 2026-06-19T07:30:02Z

Module structure: `agents/hierarchy/` vs `gen_sim/action_agent_pipeline/agents/`

The two modules are partially redundant — not duplicates of the same feature, but an in-progress migration from "LLM generates Python code that is exec()'d" (old agents/hierarchy/) to "LLM emits a JSON task graph that is compiled and executed as atomic actions" (new gen_sim/.../agents/).

Redundancy map

| File | `agents/hierarchy/` | `gen_sim/.../agents/` | Verdict |
| --- | --- | --- | --- |
| `agent_base.py` | `AgentBase` ABC + `_resolve_prompt_path` (95 lines) | `AgentBase` ABC + `_resolve_prompt_path` (95 lines) | Near-verbatim duplicate. Only diffs: gen_sim adds `from __future__ import annotations` + PEP 604 union, and `get_composed_observations()` drops the `env.get_obs_for_agent()` coupling. `_resolve_prompt_path` is byte-identical. |
| `task_agent.py` | Emits free-text plan → `agent_generated_plan.txt` | Emits JSON task graph → `agent_task_graph.json` | Same skeleton (constructor, `generate()` flow, `act()` pass-through), different output contract. |
| `llm.py` | Azure factory + 4 module-level instances | OpenAI factory via shared `utils/mllm.py` + 1 instance | Redundant factory. Gen_sim already factored the shared factory out to `utils/mllm.py`; hierarchy still hardcodes Azure. |
| `code_agent.py` | LLM writes Python → `exec()` with AST kwargs injection (289 lines) | n/a | Unique to old design; replaced by `CompileAgent` + `graph_compiler`. |
| `validation_agent.py` | Vision-LLM execution validator + view selector (241 lines) | n/a | Unique to old design; new pipeline uses declarative `env_adapters/.../success.py` predicates instead. |

External usage

agents/hierarchy/ is imported only from lab/gym/envs/tasks/tableware/base_agent_env.py:23-26 (old tableware task).
gen_sim/.../agents/ is imported only from gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py:30-36 (new gen_sim path).

There's a parallel old/new split at the env-adapter layer too. Note agents/hierarchy/code_agent.py:216 imports embodichain.lab.sim.atom_actions (singular, old) while the new pipeline uses embodichain.lab.sim.atomic_actions/ (plural package, new) — same generational split exists at the sim layer.

Improvement options

1. Finish the migration, delete agents/hierarchy/. Migrate the one consumer (lab/gym/envs/tasks/tableware/base_agent_env.py) to the gen_sim pipeline, then remove agents/hierarchy/ entirely. ValidationAgent (vision-LLM success checking) has no replacement, so either port it to env_adapters/.../success.py or accept losing that capability.
2. Extract shared base to agents/hierarchy/_base.py, have both pipelines inherit. Removes the AgentBase + _resolve_prompt_path + LLM-factory duplication. Tradeoff: introduces cross-package coupling (gen_sim would depend on agents/hierarchy/ or vice versa); only worth it if Option 1 is blocked.
3. Leave as-is, document the split. Add a deprecation docstring at the top of agents/hierarchy/ marking it as superseded-by-gen_sim.

Recommendation: Option 1 if the old tableware task is still actively used; Option 3 as a holding pattern if it's already abandoned. Option 2 is a trap — it locks in the duplication by making it structural rather than temporary.

The real question for the PR author: is the old code_agent (LLM-writes-Python) path still needed for anything the JSON-graph path can't do yet? If not, finish the migration in a follow-up and delete agents/hierarchy/.

🤖 Generated with Claude Code

skywhite1024 · 2026-06-24T03:31:14Z

Code review

Branch: 17 commits, 57 files, +17,269 LOC. The PR adds the action-agent runtime + generated-config pipeline for image-to-tabletop manipulation. The cleanup items from the prior review (duplicate atomic-actions, use_place_action, stale helpers) were verified as resolved.

Good design patterns

Clean three-layer separation in runtime/ — graph_compiler.py (JSON→graph), task_graph.py (pure graph + linear traversal), atom_actions.py (execution) each own one concern; the compiler injects graph_cls/action_module so layers are independently testable.

Single schema boundary — AtomicActionSpec (frozen dataclass) + normalize_atomic_action_spec is the one validation entry; legacy schemas (fn, action, target) are rejected with actionable messages.

Compile-time path validation — _validate_nominal_path rejects duplicate node/edge ids, dangling refs, cycles, and unreachable goals before runtime, so AgentTaskGraph.run() can assume a clean linear path.

Compiled-graph cache with hash + schema version — SHA-256 of canonical sorted-key JSON + COMPILED_GRAPH_SCHEMA_VERSION; stale artifacts self-heal.

Content-addressed CoACD mesh cache — keys cache on source_sha256 + policy_version + dexsim_engine_version + transform, so policy bumps auto-invalidate.

Consistent HTTP error envelope across SAM3/SAM3D/LLM clients — explicit timeout_s, typed errors, JSON-shape validation.

Zip-slip protection in _safe_extract_zip.

Declarative compositional success evaluator — recursive all/any/not over a leaf-predicate catalog.

Non-invasive LLM usage tracking — thin proxy over LangChain, process-local env vars, scrubbed from subprocess envs.

Network-gated tests — pytest.mark.skipif(... RUN_DEXSIM_GRASP_TESTS ...) and pytest.importorskip(...) keep the suite green without services.

Points to improve

Correctness / robustness

Schema/sentinel mismatch in graph_compiler.py — _validate_task_spec only treats JSON null as empty, but _compile_action also coerces ""/"none"/"null". An edge like {"left_arm_action": "", "right_arm_action": "none"} passes validation, compiles to both-None, and fails later in execute_parallel_atomic_actions with a worse error.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/graph_compiler.py

Lines 156 to 160 in 24901bd

edge_ids.add(edge_id)

if edge.get("left_arm_action") is None and edge.get("right_arm_action") is None:

raise ValueError(f"Nominal edge '{edge_id}' must define an arm action.")

for node_key in ("source", "target"):

success.py::_pose() crashes opaquely on bad uid — env.sim.get_rigid_object(uid) returns None for a typoed object id, then .get_local_pose(...) raises AttributeError. Every leaf predicate funnels through _pose. Add an explicit None check raising ValueError(f"Unknown rigid object uid: {uid!r}").

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/success.py

Lines 100 to 103 in 24901bd

def _pose(env, uid: str) -> torch.Tensor:

return env.sim.get_rigid_object(uid).get_local_pose(to_matrix=True)

Non-atomic cache writes — coacd_cache_bridge.py (pickle.dump to final path) and coacd_cache.py (CoACD .obj write). An interrupt leaves a corrupt file that future readers silently consume. Use a sibling temp + os.replace().

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/coacd_cache_bridge.py

Lines 174 to 183 in 24901bd

with cache_path.open("wb") as cache_file:

pickle.dump(

{

"plane_equations": plane_equations,

"plane_equation_counts": plane_equation_counts,

},

cache_file,

)

Unbounded retries / polls — prompt2geometry/pipeline.py (while True with bare except Exception on JSON extraction) and image2tabletop_client.py::wait_for_job (polls forever with no deadline). Add max-attempts + backoff + timeout.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/gym_project_api/prompt2geometry/pipeline.py

Lines 334 to 342 in 24901bd

]

while True:

try:

raw = client.chat_json(messages=messages)

return _validate_glb_stem_output(raw)

except Exception:

time.sleep(1.0)

continue

Silent exception swallowing — atom_actions.py:957 (except Exception: return on grasp-collision cache prep — bugs get masked, collision checking silently degrades); agents/llm.py::_create_llm_safe returns None with no log — root cause lost. Narrow clauses and log.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 956 to 958 in 24901bd

)

except Exception:

return

resolve_arm_side doesn't raise — calls log_error(error_type=ValueError) but log_error only logs; the invalid side is returned and downstream _select_arm_parts produces confusing shape errors.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_action_utils.py

Lines 40 to 48 in 24901bd

if side not in _available_arm_sides(env):

log_error(

f"Requested {side}_arm for robot_name='{robot_name}', but available "

f"control parts are {getattr(env.robot, 'control_parts', None)}.",

error_type=ValueError,

)

return side

task_agent.py cache has no input hash — unlike CompileAgent, a stale agent_task_graph.json is loaded silently when prompt/task_name changes but the file path doesn't. Add a prompt-version or input-hash check.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/agents/task_agent.py

Lines 55 to 59 in 24901bd

if not kwargs.get("regenerate", False) and file_path.exists():

print(f"Task graph already exists at {file_path}.")

return load_txt(file_path)

env=None defaults on required params — execute_parallel_atomic_actions and AgentTaskGraph.run. Env is dereferenced immediately; None produces an opaque AttributeError. Make required.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 436 to 440 in 24901bd

right_arm_action=None,

env=None,

return_result: bool = False,

**runtime_kwargs,

):

utils/mllm.py::apply_proxy_env mutates os.environ globally as a side effect of create_openai_client/create_chat_openai — leaks across tests and parallel workers. Pass http_client=/proxies to constructors instead.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/utils/mllm.py

Lines 39 to 46 in 24901bd

def apply_proxy_env(proxy_url: str | None) -> None:
    """Apply an optional proxy URL for OpenAI-compatible clients."""
    if not proxy_url:
        return
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url
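To illustrate the idea of scoping the proxy to one client object, here is a sketch using only the standard library's `urllib.request`. This is a swapped-in technique for illustration: the comment above suggests passing an HTTP client to the OpenAI-compatible constructors, and this stdlib version is not that code. The function name `build_proxied_opener` is hypothetical.

```python
import urllib.request
from typing import Optional


def build_proxied_opener(proxy_url: Optional[str]) -> urllib.request.OpenerDirector:
    """Build an opener whose proxy settings live on the object, not in os.environ."""
    if not proxy_url:
        # Default handlers, which still honor any proxy already set in the environment.
        return urllib.request.build_opener()
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

Because nothing global is mutated, two clients with different proxies can coexist in one process, and tests do not need to restore environment variables afterwards.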

Structure / contracts

ur5_basket_config.py is 3,665 lines holding ≥6 separable concerns (scene parsing, target replacement, LLM role refinement, relative-placement geometry, GLB/glTF parsing, static robot/sensor/light templates). At minimum split out a glb_io.py (note: _read_glb in this file duplicates mesh_frame_normalization.py::_read_glb) and move static config templates to YAML/JSON data files.

https://github.com/DexForce/EmbodiChain/blob/24901bddef40c7fe69cae4e79d4d235aa4d73462/embodichain/gen_sim/action_agent_pipeline/generation/ur5_basket_config.py

run_agent_pipeline.py is 1,329 lines mixing arg parsing, two subprocess stages, rigid-object alias resolution, history lookup, and manifest writing. Split: target-replacement resolution, image2scene stage, arg parser + _DEFAULT_* constants into separate modules.

https://github.com/DexForce/EmbodiChain/blob/24901bddef40c7fe69cae4e79d4d235aa4d73462/embodichain/gen_sim/action_agent_pipeline/cli/run_agent_pipeline.py

base_agent_env.py::_init_agents fragile to reserved keys — _agent_config_with_prompt_keys strips prompt_kwargs but not task_name/config_dir; an Agent config block with either key raises TypeError: got multiple values for keyword argument.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py

Lines 47 to 61 in 24901bd

        )
        self.task_agent = TaskAgent(
            task_llm,
            **task_agent_config,
            **agent_config["TaskAgent"],
            task_name=task_name,
            config_dir=agent_config_path,
        )
        self.compile_agent = CompileAgent(
            **compile_agent_config,
            **agent_config["CompileAgent"],
            task_name=task_name,
            config_dir=agent_config_path,
        )

agent_env.py::__init__ double-consumes **kwargs — forwards to both EmbodiedEnv.__init__ and _init_agents. The contract is fragile and undocumented.

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/agent_env.py

Lines 36 to 42 in 24901bd

    def __init__(self, cfg: EmbodiedEnvCfg = None, **kwargs):
        super().__init__(cfg, **kwargs)

        if bool(getattr(self, "ignore_terminations_during_agent", False)):
            self.cfg.ignore_terminations = True

        super()._init_agents(**kwargs)

execute_parallel_atomic_actions couples build + step + UI — root of the previously-flagged "executes immediately" concern. Split into build_parallel_action_stream(...) (pure) and step_env_with_actions(env, actions).

EmbodiChain/embodichain/gen_sim/action_agent_pipeline/runtime/atom_actions.py

Lines 487 to 491 in 24901bd

for action in tqdm(actions):

env.step(action)

env.update_obj_info()
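A rough sketch of the proposed split, using the two function names from the comment above. The action-stream format (a list of left/right dicts, with the shorter arm padded by `None`) is an assumption made for the example; the real stream layout is not shown here.

```python
from itertools import zip_longest
from typing import Any, Dict, Iterable, List, Optional, Sequence


def build_parallel_action_stream(
    left_actions: Sequence[Any], right_actions: Sequence[Any]
) -> List[Dict[str, Optional[Any]]]:
    """Pure step: pair per-tick left/right actions without touching any env."""
    return [
        {"left": left, "right": right}
        for left, right in zip_longest(left_actions, right_actions)
    ]


def step_env_with_actions(env: Any, actions: Iterable[Any]) -> int:
    """Side-effecting step: feed each prebuilt action to env.step()."""
    steps = 0
    for action in actions:
        env.step(action)
        steps += 1
    return steps
```

Keeping the builder pure means it can be unit-tested without a simulator, and progress bars or other UI can wrap only the stepping function.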

Redundant double-normalization — _compile_action normalizes via normalize_atomic_action_spec and stores a dict; execute_atomic_action re-normalizes on every execution via AtomicActionSpec.from_mapping. Cache the AtomicActionSpec at compile time.

Hardcoded intranet IPs as defaults — 192.168.3.23:{5013,5015,5016,4523} in pipeline.py, image2tabletop_client.py, zimage_client.py, run_agent_pipeline.py. Make required or env-only.

utils/__init__.py is empty and missing __all__ — AGENTS.md requires __all__ in every public module. All other __init__.py files are populated correctly; every source file has the Apache 2.0 header and from __future__ import annotations.

Inconsistent timestamps — pipeline records use local time / second precision; usage logs use UTC / millisecond precision. Pick one convention.

test_backend_atomic_runtime.py::_FakeBackendAction.capture is a class-level mutable shared across instances. Reset via fixture or make it instance-scoped.

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

Thanks for the thorough review. I addressed the cleanup items in the latest update:

Aligned graph validation with runtime action normalization, so "", "none", and "null" are rejected at compile time when both arms are empty.
Added an explicit unknown-object check in success._pose().
Switched CoACD/grasp cache writes to temp-file + os.replace().
Added bounded retry/timeout behavior for prompt2geometry naming and Image2Tabletop polling.
Narrowed/logged grasp-cache fallback exceptions and logged LLM initialization failures.
Made resolve_arm_side() raise clear ValueErrors.
Added prompt-hash metadata for TaskAgent cache validation.
Made env required for graph/action execution paths.
Changed proxy handling to per-client httpx.Client instead of mutating os.environ.
Split the large generation and CLI modules into focused files, including glb_io.py, config/template helpers, target replacement, image2scene stage, parser/defaults, and pipeline runner modules.
Split action stream building from environment stepping via build_parallel_action_stream() and step_env_with_actions().
Compile-time action normalization now stores AtomicActionSpec objects to avoid redundant normalization.
Removed hardcoded intranet service defaults and kept service URLs env/argument driven.
Added __all__ and aligned timestamp records to UTC milliseconds.
Reset the fake backend capture through an autouse fixture.
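For reference, the temp-file + `os.replace()` pattern mentioned in the list above can be sketched as follows. The helper name `atomic_write_bytes` is hypothetical and this is not the project's actual cache writer.

```python
import contextlib
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path: Path, data: bytes) -> None:
    """Write data to a temp file in the target directory, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, str(path))
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

Readers then see either the old file or the complete new one, never a partially written cache.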


skywhite1024 · 2026-06-28T10:53:37Z

Module structure: agents/hierarchy/ vs gen_sim/action_agent_pipeline/agents/

The two modules are partially redundant — not duplicates of the same feature, but an in-progress migration from "LLM generates Python code that is exec()'d" (old agents/hierarchy/) to "LLM emits a JSON task graph that is compiled and executed as atomic actions" (new gen_sim/.../agents/).

Redundancy map

| File | agents/hierarchy/ | gen_sim/.../agents/ | Verdict |
| --- | --- | --- | --- |
| agent_base.py | AgentBase ABC + _resolve_prompt_path (95 lines) | AgentBase ABC + _resolve_prompt_path (95 lines) | Near-verbatim duplicate. Only diffs: gen_sim adds from __future__ import annotations + PEP 604 union, and get_composed_observations() drops the env.get_obs_for_agent() coupling. _resolve_prompt_path is byte-identical. |
| task_agent.py | Emits free-text plan → agent_generated_plan.txt | Emits JSON task graph → agent_task_graph.json | Same skeleton (constructor, generate() flow, act() pass-through), different output contract. |
| llm.py | Azure factory + 4 module-level instances | OpenAI factory via shared utils/mllm.py + 1 instance | Redundant factory. Gen_sim already factored the shared factory out to utils/mllm.py; hierarchy still hardcodes Azure. |
| code_agent.py | LLM writes Python → exec() with AST kwargs injection (289 lines) | (none) | Unique to old design; replaced by CompileAgent + graph_compiler. |
| validation_agent.py | Vision-LLM execution validator + view selector (241 lines) | (none) | Unique to old design; new pipeline uses declarative env_adapters/.../success.py predicates instead. |

External usage

agents/hierarchy/ is imported only from lab/gym/envs/tasks/tableware/base_agent_env.py:23-26 (old tableware task).

gen_sim/.../agents/ is imported only from gen_sim/action_agent_pipeline/env_adapters/tableware/base_agent_env.py:30-36 (new gen_sim path).

There's a parallel old/new split at the env-adapter layer too. Note agents/hierarchy/code_agent.py:216 imports embodichain.lab.sim.atom_actions (singular, old) while the new pipeline uses embodichain.lab.sim.atomic_actions/ (plural package, new) — same generational split exists at the sim layer.

Improvement options

1. Finish the migration, delete agents/hierarchy/. Migrate the one consumer (lab/gym/envs/tasks/tableware/base_agent_env.py) to the gen_sim pipeline, then remove agents/hierarchy/ entirely. ValidationAgent (vision-LLM success checking) has no replacement, so either port it to env_adapters/.../success.py or accept losing that capability.

2. Extract shared base to agents/hierarchy/_base.py, have both pipelines inherit. Removes the AgentBase + _resolve_prompt_path + LLM-factory duplication. Tradeoff: introduces cross-package coupling (gen_sim would depend on agents/hierarchy/ or vice versa); only worth it if Option 1 is blocked.

3. Leave as-is, document the split. Add a deprecation docstring at the top of agents/hierarchy/ marking it as superseded-by-gen_sim.

Recommendation: Option 1 if the old tableware task is still actively used; Option 3 as a holding pattern if it's already abandoned. Option 2 is a trap — it locks in the duplication by making it structural rather than temporary.

The real question for the PR author: is the old code_agent (LLM-writes-Python) path still needed for anything the JSON-graph path can't do yet? If not, finish the migration in a follow-up and delete agents/hierarchy/.

🤖 Generated with Claude Code

Thanks for calling this out. I finished the migration and removed the legacy Python-code-generation action-agent path.
What changed:

Deleted embodichain/agents/hierarchy/ and the old prompt files used only by that path.
Removed the legacy singular embodichain.lab.sim.atom_actions module.
Removed the old PourWaterAgent-v3 and RearrangementAgent-v3 registrations that depended on BaseAgentEnv.
Kept the non-agent task envs (PourWater-v3, Rearrangement-v3) intact.
Updated API docs so Sphinx no longer imports the removed legacy modules.
The supported demos now go only through embodichain.gen_sim.action_agent_pipeline and AtomicActionsAgent-v3.

yuecideng

Thanks for the refactor — the new action-agent runtime and generated-config pipeline is a solid architectural improvement. It removes arbitrary code execution, adds deterministic graph compilation, and the test coverage is good.

However, I consolidated reviews from several focused passes and there are a number of issues that should be addressed before merging. I left inline comments on the critical ones; the rest are summarized below.

Critical (must fix)

CompileAgent bypasses AgentBase.__init__ (agents/compile_agent.py:43) — skips config_dir resolution and prompt preloading.
CompileAgent.get_composed_observations breaks Liskov substitution (agents/compile_agent.py:96) — returns dict(kwargs) instead of merged prompt contents.
Infinite retry in estimate_real_dimensions (gym_project_api/prompt2geometry/dimensions.py:74) — max_attempts=None hangs on persistent LLM failures.
Destructive _cleanup_output_root (gym_project_api/prompt2geometry/pipeline.py:424) — can delete arbitrary directory contents and follow symlinks outside the target.
Untyped cfg default in AgenticGenSimEnv.__init__ (env_adapters/tableware/agent_env.py:43) — should be EmbodiedEnvCfg | None.
Agent config keys indexed before validation (env_adapters/tableware/agent_env.py:63) — raises KeyError instead of ValueError.
Silent num_envs == 1 assumption (env_adapters/tableware/agent_env.py:117) — .squeeze(0) will break batched envs.
False-positive lifted success (env_adapters/tableware/success.py:222) — missing initial height falls back to current pose.
GLTF normalized accessor ignored (generation/mesh_bounds.py:370) — corrupts bounds for quantized meshes.
MD5 cache key (generation/coacd_cache.py:56) — disallowed by many security baselines; use SHA-256.

Important (should fix)

Core / agents

Fix agents/__init__.py __all__ to export classes (AgentBase, TaskAgent, CompileAgent, create_llm), not module names.
Replace assert with explicit exceptions and mark generate/act as @abstractmethod in AgentBase.
Add type hints to agents/llm.py and untyped public functions in runtime/atom_actions.py.
Fix _prepare_grasp_collision_cache_from_env_coacd swallowing broad exceptions.
Harden coacd_cache_bridge pickle cache (shared writable pickle is a code-execution vector).
Reuse MotionGenerator across atomic actions instead of creating one per call.

CLI / API

Fix "None" API-key string in prompt2geometry/config.py (str(None) becomes "None").
Fix target_replacement1/2 help text vs. selection logic mismatch and unify nargs semantics between pipeline and standalone generator.
Expose or document missing flags in pipeline_runner.py.
Fix run_agent.py passing int to action_sentence.
Validate URL scheme in image2tabletop_client.py.
Type pipeline_records.py and target_replacements.py properly.

Generation

Consolidate duplicate _make_relative_events_config / _make_arrangement_events_config.
Extract shared arrangement/stacking bundle builder.
Derive observation joint_ids from robot config (currently hard-coded DualUR5).
Distinguish mesh load errors from missing dependencies in _load_mesh_vertices.
Derive repo root from gym_config_path, not Path.cwd().
Unify basket bundle validator with shared UID checks.
Skip texture extraction when normalized OBJ cache is valid.
Use dataclasses.replace in _replace_relative_spec_placements.

Tests

Enable/document opt-in simulation integration tests in CI.
Move success-predicate tests from test_ur5_basket_config_generation.py to test_tableware_success.py and add failure cases.
Add a real dual-arm execute_parallel_atomic_actions test.
Split the ~3,500-line test_ur5_basket_config_generation.py.

Conventions / minor

Replace print() calls with the project logger.
Fix the license header in rearrangement.py.
Move shebangs below license headers.
Add missing docstrings to public builder functions.

Recommendation

Address the critical inline comments first, then the important items. Once the critical and most important issues are fixed and tests remain green, this should be ready for another review pass.

yuecideng · 2026-06-29T14:43:24Z

+    query_suffix = "."
+    prompt_kwargs: dict[str, dict[str, Any]]
+
+    def __init__(self, **kwargs) -> None:


CompileAgent.__init__ bypasses AgentBase.__init__, so config_dir resolution and prompt preloading are skipped. Start with super().__init__(**kwargs) and remove the duplicated setattr loop.

yuecideng · 2026-06-29T14:43:24Z

+        print("Compiled agent graph executed successfully.")
+        return result
+
+    def get_composed_observations(self, **kwargs):


This overrides get_composed_observations with different semantics (dict(kwargs) instead of merging prompt contents), breaking Liskov substitution. Align with AgentBase or rename the method.

yuecideng · 2026-06-29T14:43:24Z

+    *,
+    object_prompt: str,
+    client: OpenAICompatibleClient,
+    max_attempts: int | None = None,


max_attempts=None makes the retry loop infinite when the LLM repeatedly returns invalid JSON. Set a finite default (e.g., 3–5) or require callers to pass one.
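A sketch of a bounded retry loop with a finite default. The function name and the injected `ask_llm` callable are hypothetical stand-ins for the real client call in `estimate_real_dimensions`, which is not shown here.

```python
import json
from typing import Any, Callable, Dict, Optional


def parse_with_bounded_retries(
    ask_llm: Callable[[], str], max_attempts: int = 3
) -> Dict[str, Any]:
    """Call the LLM until it returns a JSON object, giving up after max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _attempt in range(max_attempts):
        raw = ask_llm()
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = TypeError(f"Expected a JSON object, got {type(parsed).__name__}")
    raise RuntimeError(
        f"LLM did not return valid JSON after {max_attempts} attempts"
    ) from last_error
```

Chaining the last parse error onto the final `RuntimeError` keeps the root cause visible when a persistent failure ends the loop.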

yuecideng · 2026-06-29T14:43:24Z

+    )
+
+
+def _cleanup_output_root(output_root: Path, *, keep_path: Path | None) -> None:


_cleanup_output_root can delete arbitrary directory contents and follows directory symlinks outside output_root. Add path validation (ensure it's a dedicated subdirectory), skip symlinks whose targets are outside output_root, and consider requiring --allow-cleanup.
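A conservative sketch of what a guarded cleanup could look like. The public name `cleanup_output_root`, the returned list of removed entry names, and the specific guards are assumptions for illustration; the real function and any `--allow-cleanup` flag are not shown here.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def cleanup_output_root(output_root: Path, *, keep_path: Optional[Path] = None) -> List[str]:
    """Delete direct children of output_root, never touching anything outside it."""
    root = output_root.resolve()
    if root == Path(root.anchor):
        raise ValueError(f"Refusing to clean up filesystem root: {root}")
    keep = keep_path.resolve() if keep_path is not None else None
    removed: List[str] = []
    for child in sorted(root.iterdir()):
        if child.is_symlink():
            if root not in child.resolve().parents:
                continue  # link points outside the root: leave it alone
            child.unlink()  # removes only the link, not its target
        else:
            if keep is not None and (child == keep or child in keep.parents):
                continue  # preserve keep_path and the directory that contains it
            if child.is_dir():
                shutil.rmtree(child)
            else:
                child.unlink()
        removed.append(child.name)
    return removed
```

Checking `is_symlink()` before `is_dir()` matters, because a symlink to a directory reports as a directory and would otherwise be passed to `shutil.rmtree`.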

yuecideng · 2026-06-29T14:43:24Z

+class AgenticGenSimEnv(EmbodiedEnv):
+    """Config-driven agent environment for atomic-action tasks."""
+
+    def __init__(self, cfg: EmbodiedEnvCfg = None, **kwargs):


cfg: EmbodiedEnvCfg = None is not type-safe. Use cfg: EmbodiedEnvCfg | None = None, and type reset similarly (options: dict[str, Any] | None = None) with an explicit return type.

yuecideng · 2026-06-29T14:43:24Z

+        fail = torch.zeros_like(success)
+        return success, fail, {}
+
+    def _init_agents(self, agent_config, task_name, agent_config_path=None):


_init_agents indexes agent_config['Agent'], ['TaskAgent'], ['CompileAgent'] before validating that those keys exist. A malformed config raises KeyError instead of the intended ValueError. Validate keys first, then pass the section dicts.
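A minimal sketch of validating the sections before indexing them. The helper name `validate_agent_sections` is hypothetical; the three section names come from the comment above.

```python
from typing import Any, Dict, Mapping

REQUIRED_SECTIONS = ("Agent", "TaskAgent", "CompileAgent")


def validate_agent_sections(agent_config: Mapping[str, Any]) -> Dict[str, Mapping[str, Any]]:
    """Check that every required section exists and is a mapping, then return them."""
    missing = [name for name in REQUIRED_SECTIONS if name not in agent_config]
    if missing:
        raise ValueError(f"agent_config is missing required section(s): {missing}")
    wrong_type = [
        name for name in REQUIRED_SECTIONS if not isinstance(agent_config[name], Mapping)
    ]
    if wrong_type:
        raise ValueError(f"agent_config section(s) must be mappings: {wrong_type}")
    return {name: agent_config[name] for name in REQUIRED_SECTIONS}
```

Callers can then unpack the returned section dicts, and a malformed config produces one `ValueError` naming every missing key instead of a `KeyError` on the first.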

yuecideng · 2026-06-29T14:43:24Z

+        return filtered
+
+    def get_states(self):
+        # TODO: only support num_env = 1 for now


get_states calls .squeeze(0) on robot state, silently assuming num_envs == 1. Add an explicit guard that raises if num_envs != 1, or implement proper batched support.

yuecideng · 2026-06-29T14:43:24Z

+    )
+
+
+def _object_lifted(env, spec: Mapping[str, Any]) -> torch.Tensor:


_object_lifted falls back to position[:, 2] as the initial height when obj_info lacks the object. That makes min_height always satisfied and causes false-positive success. Raise an error or load the true initial height from the environment config.

yuecideng · 2026-06-29T14:43:24Z

+    if int(buffer_view.get("buffer", 0)) != 0:
+        raise ValueError("Only GLB embedded binary buffers are supported.")
+
+    stride = int(buffer_view.get("byteStride", component_size * component_count))


_iter_gltf_accessor_vec3 ignores the GLTF accessor normalized flag. For normalized integer accessors, scale component values by the normalization factor or bounding-box computations will be corrupted.
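For reference, the glTF 2.0 specification defines how normalized integer components map to floats. A per-component sketch follows; the function name `dequantize_component` is hypothetical, while the component-type codes and formulas follow the specification.

```python
# glTF 2.0 accessor componentType codes.
BYTE = 5120
UNSIGNED_BYTE = 5121
SHORT = 5122
UNSIGNED_SHORT = 5123
FLOAT = 5126


def dequantize_component(value: float, component_type: int, normalized: bool) -> float:
    """Convert one raw accessor component to float, honoring the normalized flag."""
    if not normalized or component_type == FLOAT:
        return float(value)
    if component_type == UNSIGNED_BYTE:
        return value / 255.0
    if component_type == BYTE:
        return max(value / 127.0, -1.0)
    if component_type == UNSIGNED_SHORT:
        return value / 65535.0
    if component_type == SHORT:
        return max(value / 32767.0, -1.0)
    raise ValueError(f"Unsupported normalized componentType: {component_type}")
```

Without this step, a quantized POSITION accessor stored as normalized shorts would yield bounds in raw integer units (up to 32767) rather than in the [-1, 1] range the mesh actually encodes.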

yuecideng · 2026-06-29T14:43:24Z

+    )
+
+
+def dexsim_coacd_cache_key_for_mesh(


Cache key uses hashlib.md5, which is disallowed by many security baselines. Prefer hashlib.sha256 (or blake2b) for new code.
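A sketch of a SHA-256 based key over file contents. The function name `file_cache_key` and the `extra` parameter (for folding in settings such as decomposition parameters) are assumptions, not the signature of `dexsim_coacd_cache_key_for_mesh`.

```python
import hashlib
from pathlib import Path


def file_cache_key(path: Path, *, extra: str = "", chunk_size: int = 1 << 20) -> str:
    """Return a SHA-256 hex digest over an optional prefix plus the file's bytes."""
    digest = hashlib.sha256()
    digest.update(extra.encode("utf-8"))
    with path.open("rb") as handle:
        # Read in chunks so large meshes are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Switching the hash changes every key, so existing cache entries would be regenerated once after such a change.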

yuecideng · 2026-06-29T14:47:16Z

+        ),
+    )
+    parser.add_argument(
+        "--target_replacement1",


target_replacement should be refactored to support replacing 0 to N objects. Also, the targets are defined as the foreground interactive objects.

yuecideng · 2026-06-29T15:04:39Z

@@ -14,6 +14,6 @@
 # limitations under the License.
 # ----------------------------------------------------------------------------



Why put so many files under this folder? A CLI folder usually contains only the entry points for a piece of functionality.

yuecideng · 2026-06-29T15:07:00Z

@@ -4,12 +4,12 @@
 # All rights reserved.
 # ----------------------------------------------------------------------------

-from typing import Dict, Optional
+from __future__ import annotations


Can these two envs still run with the agent?

yuecideng · 2026-06-29T15:58:10Z

+            "mode": "modify",
+            "name": "robot/qpos",
+            "params": {
+                "joint_ids": [12, 13, 14, 15],


Hardcoded joint_ids: [12, 13, 14, 15] is DualUR5-specific. Derive these from the robot config so the observation setup works for other arms.

yuecideng · 2026-06-29T15:58:12Z

+    "get_openai_compatible_llm_config",
+]
+
+DEFAULT_LLM_MODEL = "gpt-4o"


DEFAULT_LLM_MODEL = "gpt-4o" is hardcoded. Consider loading the default from a project config or environment variable, and document how users can override it.
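One way the override could be resolved, as a sketch. The environment variable name `EMBODICHAIN_LLM_MODEL` and the helper `resolve_llm_model` are hypothetical; only the `"gpt-4o"` default comes from the diff above.

```python
import os
from typing import Optional

DEFAULT_LLM_MODEL = "gpt-4o"
LLM_MODEL_ENV_VAR = "EMBODICHAIN_LLM_MODEL"  # hypothetical variable name


def resolve_llm_model(explicit: Optional[str] = None) -> str:
    """Precedence: explicit argument, then environment variable, then built-in default."""
    if explicit:
        return explicit
    return os.environ.get(LLM_MODEL_ENV_VAR) or DEFAULT_LLM_MODEL
```

Documenting the precedence order next to the constant would cover the second half of the request.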

yuecideng · 2026-06-29T15:58:13Z

+DEFAULT_PIPELINE_HISTORY = (
+    DEFAULT_ACTION_AGENT_WORKSPACE / "configs/pipeline_history.json"
+)
+DEFAULT_TASK_NAME = "Demo3_Text"


DEFAULT_TASK_NAME = "Demo3_Text" is demo-specific. This belongs in a demo script or CLI default, not in the library defaults module.

yuecideng · 2026-06-29T15:58:14Z

+    DEFAULT_ACTION_AGENT_WORKSPACE / "configs/pipeline_history.json"
+)
+DEFAULT_TASK_NAME = "Demo3_Text"
+DEFAULT_TASK_TEMPLATE_NAMES = frozenset({"Demo1_Text"})


DEFAULT_TASK_TEMPLATE_NAMES = frozenset({"Demo1_Text"}) is also demo-specific. Move to demo-level configuration.

yuecideng · 2026-06-29T15:58:16Z

+
+def _container_runtime_uid(container: _SceneObject) -> str:
+    base = _base_name(container)
+    if "basket" in base:


Special-casing "basket" → "wicker_basket" is a hidden convention. Document it or make it configurable per task.

yuecideng · 2026-06-29T15:58:17Z

+    "make_stacking_task_prompt",
+]
+
+_BASKET_LEFT_RELEASE_OFFSET_Y = 0.04


Physical constants (_BASKET_LEFT_RELEASE_OFFSET_Y = 0.04, _PLACE_LIFT_HEIGHT = 0.10, sample intervals) are scattered as module-level literals. Consider a @configclass for per-task physical parameters.

yuecideng · 2026-06-29T15:58:18Z

+    container_config: Mapping[str, Any] | None,
+) -> list[int]:
+    if axis == "y":
+        side_order = {"left": 0, "right": 1}


side_order = {"left": 0, "right": 1} hardcodes the dual-arm convention. This should come from the robot config or arm-slot mapping.

yuecideng · 2026-06-29T15:58:20Z

+    request_id: str = "prompt2geometry_asset_0"
+    output_name: str | None = None
+    zimage_base_url: str = ""
+    zimage_width: int = 1024


Image-generation defaults (zimage_width=1024, zimage_height=1024, zimage_seed=42, zimage_num_inference_steps=8) are hardcoded. They can be overridden via the dataclass, but the defaults should be documented or moved to a service config.

yuecideng · 2026-06-29T15:58:21Z

+        **_cfg_supported_kwargs(
+            AntipodalSamplerCfg,
+            {
+                "n_sample": int(runtime_kwargs.get("grasp_antipodal_n_sample", 20000)),


Grasp sampling defaults (n_sample=20000, max_length=0.088, min_length=0.003, viser_port=11801) are inline. While some can be overridden via runtime_kwargs, the defaults should be centralized in a config object.

yuecideng · 2026-06-29T15:58:23Z

+    )
+    z_offset = object_position[:, 2] - container_position[:, 2]
+    return (
+        (xy_distance <= float(spec.get("xy_radius", spec.get("radius", 0.1))))


Success tolerances (xy_radius default 0.1, min_z_offset -0.03, max_z_offset 0.25) are hardcoded. These should be task-configurable.

Copilot AI review requested due to automatic review settings June 15, 2026 09:05

Copilot started reviewing on behalf of skywhite1024 June 15, 2026 09:05

Copilot AI reviewed Jun 15, 2026

yuecideng requested review from XuanchaoPENG and yuecideng June 15, 2026 09:26

skywhite1024 force-pushed the ljd/action-agent-runtime-pr branch from 3cff8d8 to 7412fc1 June 17, 2026 01:55

skywhite1024 changed the title ~~Add action-agent runtime for generated task configs~~ Add action-agent runtime and generated config pipeline Jun 17, 2026

yuecideng reviewed Jun 19, 2026

Comment thread embodichain/gen_sim/action_agent_pipeline/cli/generate_action_agent_config.py

yuecideng reviewed Jun 19, 2026

Comment thread embodichain/gen_sim/action_agent_pipeline/env_adapters/tableware/agent_env.py Outdated

skywhite1024 force-pushed the ljd/action-agent-runtime-pr branch from fb5165f to 6be53a6 June 26, 2026 03:28

skywhite1024 added 15 commits June 27, 2026 18:29

feat: add action agent pipeline baseline

eda68b2

change config and image root

2f68e9d

update dexsim0.4.1

64b04e2

fix demo1 basket

adb29c8

fix: normalize mesh frame generation

f13f481

fix temp glb 90 but without material

0957ecb

fix normalizer glb

c2f4427

conda activate base

ccea47d

turn 180

direction right

3a1109d

fix one object error and robot high

52d9a75

fix affoardance

7dc0d84

fix: tighten action-agent atomic runtime schema

fb473a2

fix front and back

93ea3ea

fix lower -> open hand -> retreat

7807d67

fix: address action-agent runtime review cleanup

c607e30

skywhite1024 added 15 commits June 27, 2026 18:29

Add line arrangement config generation

bdd24df

Fix dual UR5 robot view semantics

e75e3f2

fix: adapt action-agent runtime to typed atomic actions

adcd3ba

style: format action-agent config generation test

c5a23a7

fix: pass grasp mesh data to typed affordance

0daf471

Native action-agent atomic actions

557ab2f

Native relative pose-sensitive release flow

59e5100

Fix pose-sensitive relative release height

f75c295

Use object-pose release for on placements

6ca279b

improve arrangement_spec

ed4ad8b

fix Camera high

4733e16

Object Manipulation update

f803dbc

fix Object Manipulation bug

21b45a8

change ur solver

5467036

fix(sim): export UR solver module API

bbc85e7

skywhite1024 force-pushed the ljd/action-agent-runtime-pr branch from 91e7811 to bbc85e7 June 28, 2026 09:26

delete old action agent

f35defd

yuecideng requested changes Jun 29, 2026

yuecideng reviewed Jun 29, 2026

    if edge.get("left_arm_action") is None and edge.get("right_arm_action") is None:
        raise ValueError(f"Nominal edge '{edge_id}' must define an arm action.")

def _pose(env, uid: str) -> torch.Tensor:
    return env.sim.get_rigid_object(uid).get_local_pose(to_matrix=True)


